Updated precision_recall_curve.py #2490
Conversation
@sayantan1410 Thanks! I left a few comments.
@sdesrozis Corrected! Please check.
Good! However, the issue is not fully solved. The tests should be refactored to match the Cohen's kappa ones; if you compare the two, the DDP tests are missing here. Could you try to improve the tests?
Yes, sure, I can do that, no problem!
Thanks! It should be very similar to tests/ignite/contrib/metrics/test_cohen_kappa.py!
@vfdev-5 I have updated epoch_metric so that it can broadcast tuples. Let me know what to correct.
@sayantan1410 Thanks for the update. You should modify the tests to validate your improvement. See this file: tests/ignite/metrics/test_epoch_metric.py
Okay, will check it out.
Hey, can you tell me why all the tests are not run when autopep8 makes a commit?
I don't know. Did you apply all the tools (black, flake8, etc.) before pushing? Anyway, have a look here: https://github.com/pytorch/ignite/runs/5441042415?check_suite_focus=true It seems a few things do not work correctly in the code. Please format the code correctly and push.
It is asking me to add a blank line between lines 162 and 163 in test_precision_recall_curve.py, but that line was added by the autopep8 commit right after it; I cannot figure out how to re-run the check for that commit. The problem arose because I hadn't formatted the code with the linters before pushing.
@sdesrozis @vfdev-5 The Windows, TPU, and macOS tests pass in the distributed environment, but the Ubuntu and hvd tests are failing. Can you help me with this?
@sayantan1410 I think your tests are failing somewhere. Did you try to launch the parallel tests on your own machine?
I pushed a fix (I hope), using the safe mode. Let's see.
@sayantan1410 Could you fix the tests? The results are not equal to sklearn's because the tensors were not converted to numpy in the asserts. Thanks!
Yes, and the distributed tests have passed locally on CPU as well, but I cannot run the tests on GPU as I do not have a GPU on my machine.
Thank you so much for the help, I will surely do that.
You didn't run the distributed tests because they didn't work. No GPU is needed. In the tests folder, you should run run_cpu_tests.py locally instead of pytest. Explore the script to see how to run the distributed tests 😉
@sayantan1410 You did it! Thanks a lot for this work!
@sayantan1410 @sdesrozis There is an issue with this PR; check this GPU CI job: https://app.circleci.com/pipelines/github/pytorch/ignite/2510/workflows/ced3dff0-05c4-4c92-b695-9c325b48af93/jobs/7720
I'm on it.
@sayantan1410 Thanks for the PR and sorry for the delay. Please send another one to fix my review comments.
```python
def compute(self) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    if len(self._predictions) < 1 or len(self._targets) < 1:
        raise NotComputableError("EpochMetric must have at least one example before it can be computed.")
```
- "EpochMetric must have ..."
+ "PrecisionRecallCurve must have ..."
Fixing it soon.
```python
precision = torch.Tensor(precision)
recall = torch.Tensor(recall)
# thresholds can have negative strides, not compatible with torch tensors
# https://discuss.pytorch.org/t/negative-strides-in-tensor-error/134287/2
thresholds = torch.Tensor(thresholds.copy())
```
Tensor creation should be done with `torch.tensor` and not `torch.Tensor`.
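For context, a small standalone sketch of the difference (the numpy array here is just a placeholder, not the PR's actual data):

```python
import numpy as np
import torch

precision_np = np.array([0.5, 0.75, 1.0], dtype=np.float64)

# torch.Tensor(...) is the legacy constructor: it always produces float32,
# silently casting the float64 input.
legacy = torch.Tensor(precision_np)
print(legacy.dtype)  # torch.float32

# torch.tensor(...) infers the dtype from the data (float64 here) and copies it,
# which is the recommended way to build a tensor from a numpy array.
recommended = torch.tensor(precision_np)
print(recommended.dtype)  # torch.float64
```

Applied to the snippet above, the fix would presumably be `torch.tensor(precision)`, `torch.tensor(recall)`, and `torch.tensor(thresholds.copy())`.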
Okay, will change it.
@sdesrozis Thank you for all the help, I got to learn a lot!
```python
def _test_distrib_integration(device):

    rank = idist.get_rank()
```
This test case does not use rank, and each distributed process generates its own random preds and true values. The PrecisionRecallCurve implementation should gather the data from all processes, yet it is checked against a computation done on the local process only:

```python
np_y_true = y_true.cpu().numpy().ravel()
np_y_preds = y_preds.cpu().numpy().ravel()
sk_precision, sk_recall, sk_thresholds = precision_recall_curve(np_y_true, np_y_preds)
```

I would expect this to fail, but it looks like it is passing. @sayantan1410 can you check why that is?
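If it helps, here is a hedged sketch of the kind of check the comment points at: gather the per-process tensors with idist.all_gather before building the sklearn reference, so the reference sees the same data as the metric. Variable names follow the snippet above; this is an illustration only, not the final test code:

```python
import ignite.distributed as idist
from sklearn.metrics import precision_recall_curve

# Gather predictions and targets from every process, since the metric itself
# collects data across processes before computing the curve.
y_true_all = idist.all_gather(y_true)
y_preds_all = idist.all_gather(y_preds)

np_y_true = y_true_all.cpu().numpy().ravel()
np_y_preds = y_preds_all.cpu().numpy().ravel()
sk_precision, sk_recall, sk_thresholds = precision_recall_curve(np_y_true, np_y_preds)
```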
Yeah, sure, will check it!
Actually the error comes from the following test, which seems wrong (and needs a fix too): `def _test_distrib_integration(device):`
@sayantan1410 Have a look at this correct implementation: `def _test(n_epochs, metric_device):`
@sdesrozis Will check it soon!
@sdesrozis A small question about `def _test(n_epochs, metric_device):` here we also don't have anything like idist.all_gather, so how is the data getting gathered from all the processes?
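For reference, a toy sketch of what idist.all_gather does when called explicitly (an illustration only; presumably in these tests the gathering happens inside the metric's compute() rather than in the test function itself, which is part of what the question is about):

```python
import torch

import ignite.distributed as idist


def show_all_gather():
    # Each process contributes its local tensor; all_gather returns the
    # concatenation over all processes, identical on every rank.
    rank = idist.get_rank()
    local = torch.tensor([float(rank)])
    gathered = idist.all_gather(local)
    print(rank, gathered)  # with 2 processes: tensor([0., 1.]) on both ranks
```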
Fixes #1284
Updated the precision_recall_curve.py file.
For testing, I ran the tests in tests/ignite/contrib/metrics/test_precision_recall_curve.py and all of them passed, but I haven't separately tested in a distributed environment, so it could still contain some issues.